UMT Artificial Intelligence Review (UMT-AIR) 
Volume 2 Issue 1, Spring 2022 

ISSN(P): 2791-1276 ISSN(E): 2791-1268

Homepage: https://journals.umt.edu.pk/index.php/UMT-AIR 


Title: Role of GraphDB in FinTech, Blockchain Ledgers 
Author (s): Hassan Kaleem1, Sundas Rukhsar1, Waqar Ahmad2, Hafiz Ali Haris3


Affiliation (s): 19 Frances Street Crewe, England 
2Electronic Government Authority RAK, UAE
3Communication & Works Department, GOP, Pakistan 


DOI: https://doi.org/10.32350.umt-air.2 1.03 
History: Received: March 10, 2022, Revised: April 25, 2022, Accepted: June 10, 2022 

Citation: H. Kaleem, S. Rukhsar, W. Ahmad, and H. A. Haris, “Role of GraphDB in FinTech, blockchain ledgers,” UMT Artif. Intell. Rev., vol. 2, no. 1, pp. 00-00, 2022, doi: https://doi.org/10.32350.umt-air.2 1.03


Copyright: © The Authors 

Licensing: This article is open access and is distributed under the terms of the Creative Commons Attribution 4.0 International License

Conflict of Interest: Author(s) declared no conflict of interest


A publication of 
Department of Information System, Dr. Hasan Murad School of Management 
University of Management and Technology, Lahore, Pakistan 


Role of GraphDB in FinTech, Blockchain Ledgers 
Hassan Kaleem1*, Sundas Rukhsar1, Waqar Ahmad2, Hafiz Ali Haris3


1Frances Street Crewe, England
2Electronic Government Authority RAK
3Communication & Works Department, GOP, Pakistan


Abstract- GraphDB stores data in nodes and edges: nodes represent entities and edges represent the relationships between entities. The role of GraphDB in blockchain follows from the way a blockchain stores data in blocks that are connected through hash codes. In cryptography, a hash is an irreversible conversion of data, which makes it impossible to decrypt. Blockchain also uses a proof-of-work system, in which data is entered only if the majority of participants verify it. Once anything is entered into the ledger, it cannot be altered or deleted. This paper describes how hashing & indexing, query processing, transaction management, data management and data distribution are done for GraphDB in a ledger, together with previous work and libraries for building and managing a GraphDB blockchain.


Index Terms- GraphDB in FinTech, Neo4j, Hashing & Indexing, Query Processing, Transaction Management, Data Management, Data Distribution


I. Introduction


Blockchain is a peer-to-peer network that uses a digital ledger system maintained by different computers in the network. It creates blocks, and these blocks are linked together through a hash function [1]. In cryptography, a cipher is an algorithm that encrypts and decrypts data, whereas a hash code is an irreversible conversion of data, which makes the blockchain network more secure: once data has been entered into the blockchain ledger, which is the database of the blockchain, it cannot be altered or deleted. Blockchain is used for financial transactions between two parties without the involvement of any third party such as a bank. Blockchain was first developed by Satoshi Nakamoto [2]. GraphDB [3] is a type of NoSQL database that uses nodes and edges to store its data. Nodes and edges represent objects and their properties. A relational database [4] is not flexible, because once it has been created its structure stays fixed unless you change it yourself, which requires extra effort and time and is not good practice. GraphDB solves this problem because it is easy to add new attributes and relationships to the database. However, when GraphDB is developed in a blockchain, a few problems arise that relate to the security of the ledger [5]. First there was Bitcoin [2], and then came Ethereum [6], which is much more programmable than Bitcoin. Both use their own currency as a medium of exchange, whereas Hyperledger Fabric [7] does not have its own currency. Bitcoin and Ethereum are open networks that anybody can connect to, while Hyperledger Fabric is a private network that can only be accessed by the people who are part of the network. It is used for developing blockchain-based applications that can be used within a private organization.


Fig. 1. Nodes


This research discusses the motivation and benefits of the techniques adopted in recent (2017-2022) GraphDB models for Hashing & Indexing, Query Processing, Transaction Management, Data Management and Data Distribution for blockchain.
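
The hash linking of blocks described in this section can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the block format of Bitcoin, Ethereum or Hyperledger Fabric: the field names (`index`, `data`, `prev_hash`), the JSON serialization and the all-zero hash of the first block are hypothetical choices made only for illustration.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # assumed placeholder for the first block


def block_hash(index, data, prev_hash):
    """Return the SHA-256 digest (hex string) of a block's contents."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link each record to the previous block through that block's hash."""
    chain = []
    prev_hash = GENESIS_PREV_HASH
    for index, data in enumerate(records):
        digest = block_hash(index, data, prev_hash)
        chain.append({"index": index, "data": data,
                      "prev_hash": prev_hash, "hash": digest})
        prev_hash = digest
    return chain


def is_valid(chain):
    """Recompute every hash; an edited block no longer matches its digest."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        expected = block_hash(block["index"], block["data"], prev_hash)
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain = build_chain(["A pays B 5", "B pays C 2", "C pays A 1"])
print(is_valid(chain))             # True
chain[1]["data"] = "B pays C 200"  # tamper with a stored record
print(is_valid(chain))             # False
```

Because every block stores the hash of its predecessor, changing one record invalidates the check for that block, which is the sense in which ledger entries cannot be silently altered.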


II. Background Material


This section first introduces the graph database. GraphDB [3] is a type of NoSQL database that uses nodes and edges to store its data. Nodes and edges represent objects and their properties.


A. GraphDB


The importance of storing data in GraphDB is explained in [8], [9]. A GraphDB consists of two elements, edges and nodes, which represent objects and their properties. Many systems now use GraphDB; Twitter is a well-known example [10]. For example, in Figure 1 the “burger” and “food” nodes have a relationship, and each node has different attributes. There are different models for GraphDB [11], such as the following.


AllegroGraph provides special features for analysis of social networks. DEX is based on Java libraries for the management of temporary graphs and ensures good performance for very large-scale databases. HyperGraphDB is useful for modeling data in fields such as Artificial Intelligence and Bioinformatics. InfiniteGraph is useful for analysis in business, social and government intelligence. Neo4j is an open-source GraphDB that uses nodes, edges and their attributes to store data. Sones is a GraphDB that has its own graph query language.
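
The node-and-edge model described in this section can be sketched in a few lines of Python. This is an in-memory illustration only and does not reflect how any of the systems listed here store data; the class names, the `IS_A` relationship type and the attributes `price`, `category` and `vegetarian` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: int
    target: int
    rel_type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    def __init__(self):
        self.nodes = {}   # node_id -> Node
        self.edges = []   # list of Edge

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, dict(properties))

    def add_edge(self, source, target, rel_type, **properties):
        self.edges.append(Edge(source, target, rel_type, dict(properties)))

    def neighbours(self, node_id):
        """Follow edges in either direction from node_id."""
        outgoing = [e.target for e in self.edges if e.source == node_id]
        incoming = [e.source for e in self.edges if e.target == node_id]
        return outgoing + incoming


graph = PropertyGraph()
graph.add_node(1, "burger", price=5.0)
graph.add_node(2, "food", category="meal")
graph.add_edge(1, 2, "IS_A")
# A new attribute can be attached later without any schema change.
graph.nodes[1].properties["vegetarian"] = False
print(graph.neighbours(2))  # [1]
```

Adding the `vegetarian` attribute after the node exists mirrors the flexibility argument made in the Introduction: no table definition has to be altered first.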

III. GraphDB in Blockchain

Blockchain is a peer-to-peer network that uses a digital ledger system maintained by different computers in the network. The goal of this research is to discuss the motivation and benefits of the techniques adopted in recent (2017-2022) GraphDB models for Hashing & Indexing, Query Processing, Transaction Management and Data Management. A study [12] showed that 256 GB of Bitcoin data would require only about a third of that size, roughly 86 GB of storage [13]. Thus, it will result in high performance.


A. Hashing & Indexing


Indexing is a way of sorting a number of records on different fields, and hashing is used to retrieve them using a shorter hashed key. Hashing & indexing are used in different contexts in GraphDB [14]. Because GraphDB contains property values, there are two kinds of graph indexes: structure-based and value-based. They occur in GraphDB in different forms, for example as full-text querying support. Some search engines, such as Apache Lucene, are used in GraphDB as the index backend.


In value-based indexing, the research presents indexing with three different Graph Database Management Systems (GDBMS). The Cypher language of Neo4j allows an index to be created on one or multiple properties for all nodes, but only for nodes that have a given label [15]. Sparksee uses compressed bitmap indexes and B+ trees to store nodes and edges together with their properties. For faster query processing, Titan supports two different kinds of indexing mechanism: node-centric indexing and graph indexing. Node-centric indexes are local index structures that are built on individual nodes. Graph indexes allow enhanced retrieval of nodes and edges by their properties under selection conditions. Some GDBMS are multi-model and support the graph, key-value and document data models, for example OrientDB, which uses indexing algorithms such as SB-Tree [16]. Among these value-based indexes, Neo4j is less efficient than OrientDB on some of its nodes, so OrientDB is recommended for large numbers of nodes [17].
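
The idea behind a value-based index, retrieving nodes through a hashed key instead of scanning them all, can be sketched as follows. This is a simplified illustration built on a Python dictionary, not the index implementation of Neo4j, Sparksee, Titan or OrientDB; the class `ValueIndex` and its methods are hypothetical, and property values are assumed to be hashable.

```python
from collections import defaultdict


class ValueIndex:
    """Hash index from (label, property, value) to the matching node ids."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, node_id, label, properties):
        # Index every property of the node under its label.
        for key, value in properties.items():
            self._index[(label, key, value)].add(node_id)

    def lookup(self, label, key, value):
        # One hash lookup instead of a scan over every node.
        return sorted(self._index.get((label, key, value), set()))


index = ValueIndex()
index.add(1, "Person", {"name": "Jim"})
index.add(2, "Person", {"name": "Sam"})
index.add(3, "City", {"name": "Sam"})
print(index.lookup("Person", "name", "Sam"))  # [2]
```

Keying the index by label as well as property mirrors the restriction noted for Neo4j, where an index applies only to nodes carrying a given label.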


Structure-based indexing is used to index and extract structural properties of a GraphDB, generally at the time of insertion and in response to a query. Many methods are path-based, for example the SPath algorithm [18], which uses a path-based indexing technique over the local neighbourhood of nodes in a graph G in order to transform a query graph into a set of shortest paths for query processing. Srinivasa [19] distinguishes three types of structure-based indexes: path-based, index-based and spectral methods. Spectral methods use concepts from spectral graph theory. However, no index structure supports all kinds of substructure features. The researchers in [20] propose Lindex, a kind of graph index that indexes the subgraphs contained in a GraphDB. Similar feature-based graph index techniques can be found in [21]. In [18], the researchers introduced two indexing techniques, structure-aware indexing and attribute-aware indexing, to process graph matching for property graphs.


Index graph patterns are a pointer-based data structure that stores references to graph patterns. A number of graph patterns are found in different GraphDBs. Index graphs hold different values for patterns depending on the kind of data stored in the GraphDB, and use cases are included in these graph patterns. One of the most popular graph patterns, defined as


GP = (Vp, Ep), where Vp = {v1, v2, v3} and Ep = {(v1, v2), (v2, v3), (v3, v1)}, is called a triangle. In Cypher, a triangle can be described in a few different ways, for example


(n1)-[r1]-(n2)-[r2]-(n3)-[r3]-(n1)
Fig. 2. Cypher triangle (Person nodes, such as Jim and Sam, linked by Is_Friend_of relationships)


Figure 2 shows a social group. Retrieving that kind of pattern is easy for Neo4j. The problem arises when we focus on structural features of the graph and require all the triangles that have friendship relationships in them. In that case, structure-based indexing is more suitable.
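
To make the triangle pattern concrete, the sketch below enumerates all friendship triangles in a small undirected graph. It is a plain neighbour-intersection enumeration, not SPath, Lindex or any other structure-based index from the literature cited here; the function name and the people "Ann" and "Bob" are hypothetical additions to the Jim and Sam of Figure 2.

```python
from itertools import combinations


def find_triangles(edges):
    """Return every triangle (a, b, c) with a < b < c in an undirected graph."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    triangles = []
    for a in sorted(adjacency):
        # Only consider neighbours larger than a so each triangle appears once.
        larger = sorted(n for n in adjacency[a] if n > a)
        for b, c in combinations(larger, 2):
            if c in adjacency[b]:
                triangles.append((a, b, c))
    return triangles


friendships = [("Jim", "Sam"), ("Sam", "Ann"), ("Ann", "Jim"), ("Ann", "Bob")]
print(find_triangles(friendships))  # [('Ann', 'Jim', 'Sam')]
```

Without an index, this work is repeated on every query, which is why a structural index that stores such subgraphs in advance becomes attractive on large graphs.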


B. Query Processing 


Neo4j [3] is an open-source NoSQL graph database. It stores data as nodes and edges. Nodes are the entities in GraphDB that store attributes. For fast query processing, Neo4j offers multiple libraries [22]. With almost zero coding, their algorithms can reveal hidden patterns in GraphDB. These features make Neo4j an ideal tool for analyzing and visualizing the data in blockchain ledgers. Moreover, Neo4j offers multiple graph models to support the user's requirements. Cypher [23] is a Declarative Query Language (DQL) similar to Structured Query Language (SQL), but optimized for GraphDB. This language is now also used by SAP HANA [24], RedisGraph [25] and openCypher [26]. Cypher queries are available at Neo4j [27].


The following example Cypher query [12] returns the total number of tokens that have been sent to “Address” by transactions. These transactions are taken from the chain that extends from the block with the given hash.


MATCH (b:Block {hash: $hash})-[:CHILD_OF*0..]->()-[K]-(t:Transactions)-[:TO]->(u:Users {hex: 'Address'})
RETURN SUM(t.amount)
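
The traversal that this query expresses can be approximated in plain Python. The sketch assumes that `CHILD_OF` points from a block to its parent and that each block lists its own transactions; the dictionary layout, the function name and the sample hashes and addresses are hypothetical and are not taken from [12].

```python
def tokens_sent_to(blocks, start_hash, address):
    """Sum amounts sent to `address` in the chain reachable from `start_hash`.

    `blocks` maps a block hash to a dict with a `parent` hash (or None)
    and a list of `transactions`, each with `to` and `amount` fields.
    """
    total = 0
    current = start_hash
    # Follow CHILD_OF links zero or more times, like [:CHILD_OF*0..].
    while current is not None:
        block = blocks[current]
        for tx in block["transactions"]:
            if tx["to"] == address:
                total += tx["amount"]
        current = block["parent"]
    return total


blocks = {
    "h0": {"parent": None, "transactions": [{"to": "addr1", "amount": 4}]},
    "h1": {"parent": "h0", "transactions": [{"to": "addr2", "amount": 7}]},
    "h2": {"parent": "h1", "transactions": [{"to": "addr1", "amount": 3},
                                             {"to": "addr1", "amount": 1}]},
}
print(tokens_sent_to(blocks, "h2", "addr1"))  # 8
```

The loop has to be written by hand here, whereas the declarative query leaves the choice of traversal to the database.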


Unlike the classical structure of the blockchain, these GraphDB models allow bidirectional traversal of entities in the requested path. With the help of Neo4j, we can analyze and evaluate a query just by looking at its execution plan. Casper's [28] usefulness is based on its capacity to detect and penalize bad validators who take advantage of any vulnerability. In our Casper-like protocol, network nodes are rewarded for tracking and reporting such offenders, with cash incentives provided in the event of a well-executed slashing. Furthermore, because all the evidence in support of a rule violation is connected, any node may detect a breach and recover from it.


MATCH (v1:Vote), (v2:Vote)
WHERE v1.r_from = v2.r_from AND v1.target_height = v2.target_height AND ID(v1) < ID(v2)
RETURN v1.r_from, v1, v2


Returns all distinct pairs of votes, v1 and v2, submitted by the same validator with targets at the same height.


MATCH (v1:Vote), (v2:Vote)
WHERE v1.r_from = v2.r_from AND v1.target_height > v2.target_height AND v1.source_height < v2.source_height AND NOT ID(v1) = ID(v2)
RETURN v1.r_from, v1, v2


Returns all distinct pairs of votes, v1 and v2, sent by the same validator, where one vote is inside the span of the other.


The above queries are not for a specific branch; they run over the whole blockchain tree.
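
The two conditions checked by these queries can also be written as ordinary Python predicates, which may help when reading the WHERE clauses. The field names `r_from`, `source_height` and `target_height` follow the queries; the `Vote` class, the `vote_id` field (standing in for `ID(v)`) and the sample votes are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Vote:
    vote_id: int
    r_from: str          # validator that sent the vote
    source_height: int
    target_height: int


def double_votes(votes):
    """Pairs from one validator that target the same height."""
    return [(a, b) for a, b in combinations(votes, 2)
            if a.r_from == b.r_from and a.target_height == b.target_height]


def surround_votes(votes):
    """Pairs (outer, inner) where the inner span lies inside the outer span."""
    return [(a, b) for a in votes for b in votes
            if a.vote_id != b.vote_id
            and a.r_from == b.r_from
            and a.target_height > b.target_height
            and a.source_height < b.source_height]


votes = [
    Vote(1, "val_a", 1, 5),
    Vote(2, "val_a", 2, 5),   # same target as vote 1: double vote
    Vote(3, "val_a", 3, 4),   # span 3..4 lies inside 1..5 and 2..5
    Vote(4, "val_b", 1, 5),   # different validator, never paired with val_a
]
print([(a.vote_id, b.vote_id) for a, b in double_votes(votes)])    # [(1, 2)]
print([(a.vote_id, b.vote_id) for a, b in surround_votes(votes)])  # [(1, 3), (2, 3)]
```

Using `combinations` plays the same role as `ID(v1) < ID(v2)` in the first query: each unordered pair is reported once.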


C. Transaction Management 


Cypher is a query language based on patterns, and it is specially designed to recognize these patterns. Common keywords in Cypher are MATCH, WHERE and RETURN. These keywords are very similar to their SQL counterparts, with slight differences. Transaction management in the Neo4j GraphDB has four properties, known as ACID, which are described below.


1. Atomicity: in case of any transaction failure, data is left unchanged.

2. Consistency: every transaction leaves the database in a consistent state.

3. Isolation: similar to a locking technique, during the transaction data cannot be accessed by any other operation.

4. Durability: the database management system can recover the results of a committed transaction.
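
Atomicity in particular is easy to demonstrate. The sketch below uses Python's built-in `sqlite3` module as a stand-in, because it runs without a server; it is not Neo4j and not the Java code referenced in this section, and the `account` table, the names and the `transfer` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('alice', 10), ('bob', 0)")
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move `amount` between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, receiver))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, sender))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


print(transfer(conn, "alice", "bob", 4), balances(conn))
print(transfer(conn, "alice", "bob", 50), balances(conn))
```

The second transfer credits the receiver first and then fails the balance check on the sender, so the rollback has to undo the credit; the balances printed after it are the same as after the first transfer.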
Code for transaction management in Neo4j [29], written in Java, is available on GitHub [30]. According to Neo4j, there are 7 steps to manage a transaction in GraphDB.


1. Interaction cycle: ensures that the ACID properties are fulfilled, for example by beginning a transaction, performing a database operation, and committing or rolling back the transaction.

2. Isolation levels: data that is being processed by a transaction is not available to other processes until the transaction is completely committed.

3. Default locking behavior: it is similar to that of a relational database.

4. Deadlock: Neo4j can detect deadlocks before they happen and throw an exception. Deadlock code is available on GitHub [31].

5. Delete semantics: Neo4j enforces that every relationship has a start node and an end node. This means that if we try to delete a node that still has a relationship with another node, an exception is thrown.

6. Creating unique nodes: a certain level of uniqueness is required in nodes. There are two techniques to ensure it, single thread and get or create.

7. Transaction event: any transaction event in GraphDB is notified through a Transaction Event Listener [32].


Fig. 3. Data management (relationship labels include Makes, Purchased, Accepted and Notifies)
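
Steps 5 and 6 of the list above can be illustrated with a small in-memory model. This is a conceptual sketch only and does not use the Neo4j API: the `TinyGraph` class, its method names, the keys `user:1` and `user:2` and the `PAYS` relationship are hypothetical, and a single lock is assumed to be an acceptable way to serialize access.

```python
import threading


class TinyGraph:
    """In-memory stand-in for a graph store (not the Neo4j API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.nodes = {}          # unique key -> properties
        self.relationships = []  # (start_key, rel_type, end_key)

    def get_or_create(self, key, **properties):
        """Return the node for `key`, creating it once even under concurrency."""
        with self._lock:
            if key not in self.nodes:
                self.nodes[key] = dict(properties)
            return self.nodes[key]

    def relate(self, start_key, rel_type, end_key):
        with self._lock:
            if start_key not in self.nodes or end_key not in self.nodes:
                raise KeyError("a relationship needs a start node and an end node")
            self.relationships.append((start_key, rel_type, end_key))

    def delete_node(self, key):
        """Refuse to delete a node that still has relationships."""
        with self._lock:
            if any(key in (start, end) for start, _, end in self.relationships):
                raise ValueError(f"node {key!r} still has relationships")
            del self.nodes[key]


g = TinyGraph()
threads = [threading.Thread(target=g.get_or_create, args=("user:1",))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
g.get_or_create("user:2")
g.relate("user:1", "PAYS", "user:2")
print(len(g.nodes))  # 2: eight concurrent calls created "user:1" only once
try:
    g.delete_node("user:1")
except ValueError as exc:
    print("refused:", exc)
```

The lock makes the check and the insert a single step, which is the essence of the get-or-create technique; the delete check mirrors the rule that no relationship may be left without both of its end points.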


D. Data Management 


The decentralized nature of blockchain allows data to be accessed easily within the organization. Fig. 3 describes how data is managed in GraphDB, using relationships such as Purchases, Ships To and Ships From.
Benefits of Blockchain for Data Management


1. Data Security: blockchain uses hash values, which are an irreversible conversion of data, and this provides a higher level of security than encryption algorithms.

2. Data Quality: data is stored in edges, and any data entered into the ledger is analyzed and cross-checked before it is stored in the database.

3. Data Traceability: data can be traced through tokens, and as blockchain is linear, a historical chain of events can be followed easily.

4. Real-Time Data Analysis: due to the nature of blockchain, businesses can easily notice and view any kind of irregularity as it occurs.

5. Data Sharing: the nature of blockchain allows data to be shared easily within the organization, and permissions can be set for who can read or edit the data.


The studies and tools in [33]-[36] can be used to build and manage the data in the ledger.


Fig. 4. Data distribution [38]

E. Data Distribution 

According to Jim Webber, chief scientist of Neo4j, Neo4j is becoming a distributed GraphDB [37]. Neo4j has a limit on scalability. With the launch of version 4.0, Neo4j is going to be a distributed GraphDB.
With Neo4j, Webber says, the customer has to decide how to split up or shard their data into a distributed environment of sub-GraphDBs. In a Fabric server, however, they can be queried as a single entity. Neo4j is following the MongoDB [39] model for this. In this model, customers can split their data according to their business function and geographical region requirements.
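
The idea of splitting data by region while still querying it as a single entity can be sketched as follows. This is a conceptual illustration, not Neo4j Fabric or MongoDB sharding: the `ShardedGraph` class, the shard names `europe` and `asia`, and the sample people and relationships are all hypothetical.

```python
class ShardedGraph:
    """Keeps one edge list per shard and answers queries across all of them."""

    def __init__(self):
        self.shards = {}  # shard name -> list of (source, rel_type, target)

    def add_edge(self, shard, source, rel_type, target):
        # The caller decides which shard (e.g. which region) holds the edge.
        self.shards.setdefault(shard, []).append((source, rel_type, target))

    def query(self, rel_type):
        """Query every shard and merge the results as if it were one graph."""
        results = []
        for shard in sorted(self.shards):
            for source, rel, target in self.shards[shard]:
                if rel == rel_type:
                    results.append((shard, source, target))
        return results


sharded = ShardedGraph()
sharded.add_edge("europe", "alice", "PAYS", "bob")
sharded.add_edge("asia", "chen", "PAYS", "dina")
sharded.add_edge("asia", "chen", "KNOWS", "alice")
print(sharded.query("PAYS"))
# [('asia', 'chen', 'dina'), ('europe', 'alice', 'bob')]
```

The placement decision stays with the customer, as described above, while the `query` method hides the split from whoever reads the data.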


IV. Conclusion 


The study highlighted the features of GraphDB in financial technologies. GraphDB stores data in nodes and edges: nodes store data entities and edges store the relationships between entities. The study presented hashing and indexing techniques for GraphDB in blockchain. There are two types of indexing: the first is value-based, which uses the Cypher query language, and the second is structure-based, which uses the SPath algorithm. The study then presented how queries are performed in a GraphDB blockchain. For query processing, Neo4j provides multiple libraries, and unlike traditional databases, GraphDB allows bidirectional passes between entities. In the transaction management section, the Cypher query language is used to recognize patterns, and Java code is available for this. The data management section describes how data is stored and managed in the database. This study also provided solutions for how to build and manage the data in a GraphDB blockchain.